Goto

Collaborating Authors

 mutation process


Cancer-inspired Genomics Mapper Model for the Generation of Synthetic DNA Sequences with Desired Genomics Signatures

arXiv.org Artificial Intelligence

Genome data are crucial in modern medicine, offering significant potential for diagnosis and treatment. Thanks to technological advancements, many millions of healthy and diseased genomes have already been sequenced; however, obtaining the most suitable data for a specific study, and specifically for validation studies, remains challenging with respect to scale and access. Therefore, in silico genomics sequence generators have been proposed as a possible solution. However, the current generators produce inferior data using mostly shallow (stochastic) connections, detected with limited computational complexity in the training data. This means they do not take the appropriate biological relations and constraints, that originally caused the observed connections, into consideration. To address this issue, we propose cancer-inspired genomics mapper model (CGMM), that combines genetic algorithm (GA) and deep learning (DL) methods to tackle this challenge. CGMM mimics processes that generate genetic variations and mutations to transform readily available control genomes into genomes with the desired phenotypes. We demonstrate that CGMM can generate synthetic genomes of selected phenotypes such as ancestry and cancer that are indistinguishable from real genomes of such phenotypes, based on unsupervised clustering. Our results show that CGMM outperforms four current state-of-the-art genomics generators on two different tasks, suggesting that CGMM will be suitable for a wide range of purposes in genomic medicine, especially for much-needed validation studies.


Differential Evolution from Scratch in Python

#artificialintelligence

Differential evolution is a heuristic approach for the global optimisation of nonlinear and non- differentiable continuous space functions. The differential evolution algorithm belongs to a broader family of evolutionary computing algorithms. Similar to other popular direct search approaches, such as genetic algorithms and evolution strategies, the differential evolution algorithm starts with an initial population of candidate solutions. These candidate solutions are iteratively improved by introducing mutations into the population, and retaining the fittest candidate solutions that yield a lower objective function value. The differential evolution algorithm is advantageous over the aforementioned popular approaches because it can handle nonlinear and non-differentiable multi-dimensional objective functions, while requiring very few control parameters to steer the minimisation.


Semi-Supervised Learning with Trees

Neural Information Processing Systems

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function from the labeled examples. We test our approach on eight real-world datasets.


Semi-Supervised Learning with Trees

Neural Information Processing Systems

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function from the labeled examples. We test our approach on eight real-world datasets.


Semi-Supervised Learning with Trees

Neural Information Processing Systems

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function fromthe labeled examples. We test our approach on eight real-world datasets.